This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TCAD.2020.3003843, IEEE
Transactions on Computer-Aided Design of Integrated Circuits and Systems
TABLE V: Experimental results on DAC 2012 benchmarks [41] for routability-driven placement.
Columns: Design, #nodes, #nets, followed by three result groups: RePlAce†, DREAMPlace (40 threads), and DREAMPlace (RTX 2080TI). Each group reports sHPWL, RC, and Runtime (s), with runtime broken down into GP (NL and GR), LG, DP, and Total.
Design  #nodes  #nets   RePlAce sHPWL  RePlAce RC
SB2     1014K   991K    62.39          102.47
SB3     920K    898K    30.69          100.81
SB6     1014K   1007K   31.30          100.61
SB7     1365K   1340K   37.20          101.13
SB9     847K    834K    21.48          101.09
SB11    955K    936K    34.28          102.65
SB12    1293K   1293K   26.69          103.02
SB14    635K    620K    21.26          100.75
SB16    699K    697K    25.57          102.29
SB19    523K    512K    14.21          101.05
6981
2354
1874
2068
1866
2676
3040
740
2168
969
548
438
369
549
441
188
539
257
46
70
44
54
23
28
153
22
16
17
160
149
144
201
148
108
230
87
9382
3565
2634
2794
2426
3385
3898
1052
2331
1685
DREAMPlace (40 threads)
Design  sHPWL   RC
SB2     61.06   101.57
SB3     30.18   100.73
SB6     30.92   100.26
SB7     36.73   100.60
SB9     21.21   100.61
SB11    32.86   100.86
SB12    26.90   101.25
SB14    21.25   100.55
SB16    25.42   101.77
SB19    15.10   103.28
3953
3306
1888
963
1200
524
309
144
87
214
319
146
119
108
DREAMPlace (40 threads), runtime (s)
Design  LG   DP
SB2     30   183
SB3     16   172
SB6     27   168
SB7     23   234
SB9     11   170
SB11    24   125
SB12    5    278
SB14    15   104
SB16    2    105
SB19    1    126
5390
4040
2414
1395
964
1602
3398
1345
891
DREAMPlace (RTX 2080TI)
Design  sHPWL   RC
SB2     61.20   101.76
SB3     30.18   100.65
SB6     31.00   100.33
SB7     36.73   100.61
SB9     21.23   100.65
SB11    32.80   100.79
SB12    26.38   100.72
SB14    21.24   100.51
SB16    25.53   101.94
SB19    14.67   102.73
293
131
169
78
1215
485
309
143
83
283
398
148
115
57
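DREAMPlace (RTX 2080TI), runtime (s)
Design  LG   DP    Total
SB2     31   184   1746
SB3     16   182   835
SB6     27   168   694
SB7     23   233   509
SB9     11   171   337
SB11    24   125   603
SB12    5    278   883
SB14    15   108   349
SB16    2    106   283
SB19    1    133   274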
677
54
1218
2767
1067
649
150
171
65
44
71
1669
1288
91
110
701
948
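Ratio (per result group, in column order sHPWL, RC, GP NL, GP GR, LG, DP, Total):
RePlAce                   1.010  1.005  21.6  2.7  6.2  0.8  5.4
DREAMPlace (40 threads)   1.004  1.001  14.0  1.1  1.0  1.0  3.4
DREAMPlace (RTX 2080TI)   1.000  1.000   1.0  1.0  1.0  1.0  1.0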
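The ratio entries can be checked against the per-design values. The sketch below assumes that each ratio is the arithmetic mean of per-design ratios normalized to DREAMPlace (RTX 2080TI); this excerpt does not state the averaging convention, so treat it as an assumption. Under that assumption, the RePlAce sHPWL and RC columns reproduce the reported 1.010 and 1.005. The function name `mean_ratio` is illustrative only.

```python
from statistics import mean

# Per-design values for SB2, SB3, SB6, SB7, SB9, SB11, SB12, SB14, SB16, SB19,
# transcribed from Table V.
replace_shpwl = [62.39, 30.69, 31.30, 37.20, 21.48, 34.28, 26.69, 21.26, 25.57, 14.21]
gpu_shpwl = [61.20, 30.18, 31.00, 36.73, 21.23, 32.80, 26.38, 21.24, 25.53, 14.67]
replace_rc = [102.47, 100.81, 100.61, 101.13, 101.09, 102.65, 103.02, 100.75, 102.29, 101.05]
gpu_rc = [101.76, 100.65, 100.33, 100.61, 100.65, 100.79, 100.72, 100.51, 101.94, 102.73]


def mean_ratio(values, baseline):
    """Arithmetic mean of the per-design ratios values[i] / baseline[i].

    Assumption: this is how the ratio row is computed; the excerpt does not say.
    """
    return mean(v / b for v, b in zip(values, baseline))


shpwl_ratio = mean_ratio(replace_shpwl, gpu_shpwl)
rc_ratio = mean_ratio(replace_rc, gpu_rc)
print(f"RePlAce sHPWL ratio: {shpwl_ratio:.3f}")  # 1.010
print(f"RePlAce RC ratio:    {rc_ratio:.3f}")     # 1.005
```

Note that a ratio of sums would weight the larger designs more heavily, so it would not in general give the same figure as the mean of per-design ratios.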
Results for both RePlAce and DREAMPlace were collected on a Linux machine with two 20-core Intel Xeon Gold 6230 CPUs (40 cores in total) and one NVIDIA RTX 2080TI GPU.
† We obtained the RePlAce binary [8] to keep the experimental settings consistent for this benchmark suite. As the RePlAce binary uses float32 for nonlinear placement, we use the same setting for DREAMPlace in this experiment. The binary supports only single-threaded execution, and the external global router NCTUgr is also single-threaded.
ACKNOWLEDGMENT
The authors would like to thank Mr. Lutong Wang and Mr. Ilgweon Kang from the University of California at San Diego for preparing the RePlAce binary, suggesting experimental setups, and verifying the results.
This project is supported in part by NVIDIA.
REFERENCES
[1] A. B. Kahng, S. Reda, and Q. Wang, “Architecture and details of a high quality, large-scale analytical placer,” in IEEE/ACM International Conference on Computer-Aided Design (ICCAD). IEEE, 2005, pp. 891–898.
[2] T. Chan, J. Cong, and K. Sze, “Multilevel generalized force-directed method for circuit placement,” in ACM International Symposium on Physical Design (ISPD). ACM, 2005, pp. 185–192.
[3] A. B. Kahng and Q. Wang, “A faster implementation of APlace,” in ACM International Symposium on Physical Design (ISPD). ACM, 2006, pp. 218–220.
[4] T.-C. Chen, Z.-W. Jiang, T.-C. Hsu, H.-C. Chen, and Y.-W. Chang, “NTUplace3: An analytical placer for large-scale mixed-size designs with preplaced blocks and density constraints,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), vol. 27, no. 7, pp. 1228–1240, 2008.
[5] M.-K. Hsu, Y.-F. Chen, C.-C. Huang, S. Chou, T.-H. Lin, T.-C. Chen, and Y.-W. Chang, “NTUplace4h: A novel routability-driven placement algorithm for hierarchical mixed-size circuit designs,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), vol. 33, no. 12, pp. 1914–1927, 2014.
[6] J. Lu, P. Chen, C.-C. Chang, L. Sha, D. J.-H. Huang, C.-C. Teng, and C.-K. Cheng, “ePlace: Electrostatics-based placement using fast Fourier transform and Nesterov’s method,” ACM Transactions on Design Automation of Electronic Systems (TODAES), vol. 20, no. 2, p. 17, 2015.
[7] J. Lu, H. Zhuang, P. Chen, H. Chang, C. Chang, Y. Wong, L. Sha, D. Huang, Y. Luo, C. Teng, and C. Cheng, “ePlace-MS: Electrostatics-based placement for mixed-size circuits,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), vol. 34, no. 5, pp. 685–698, 2015.
[8] C. Cheng, A. B. Kahng, I. Kang, and L. Wang, “RePlAce: Advancing solution quality and routability validation in global placement,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), vol. 38, no. 9, pp. 1717–1730, 2019.
[9] Z. Zhu, J. Chen, Z. Peng, W. Zhu, and Y.-W. Chang, “Generalized augmented Lagrangian and its applications to VLSI global placement,” in ACM/IEEE Design Automation Conference (DAC). IEEE, 2018, pp. 1–6.
[10] N. Viswanathan, M. Pan, and C. Chu, “FastPlace 3.0: A fast multilevel quadratic placement algorithm with placement congestion control,” in IEEE/ACM Asia and South Pacific Design Automation Conference (ASPDAC). IEEE, 2007, pp. 135–140.
[11] X. He, T. Huang, L. Xiao, H. Tian, and E. F. Y. Young, “Ripple: A robust and effective routability-driven placer,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), vol. 32, no. 10, pp. 1546–1556, 2013.
[12] T. Lin, C. Chu, J. R. Shinnerl, I. Bustany, and I. Nedelchev, “POLAR: placement based on novel rough legalization and refinement,” in IEEE/ACM International Conference on Computer-Aided Design (ICCAD). IEEE, 2013, pp. 357–362.
[13] T. Lin, C. Chu, J. R. Shinnerl, I. Bustany, and I. Nedelchev, “POLAR: A high performance mixed-size wirelengh-driven placer with density constraints,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), vol. 34, no. 3, pp. 447–459, 2015.
[14] M.-C. Kim, D.-J. Lee, and I. L. Markov, “SimPL: An effective placement algorithm,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), vol. 31, no. 1, pp. 50–60, 2012.
[15] M.-C. Kim, N. Viswanathan, C. J. Alpert, I. L. Markov, and S. Ramji, “MAPLE: multilevel adaptive placement for mixed-size designs,” in ACM International Symposium on Physical Design (ISPD). IEEE, 2012, pp. 193–200.
[16] T. Lin, C. Chu, and G. Wu, “POLAR 3.0: An ultrafast global placement engine,” in IEEE/ACM International Conference on Computer-Aided Design (ICCAD). IEEE, 2015, pp. 520–527.
[17] W. Li, Y. Lin, and D. Z. Pan, “elfPlace: Electrostatics-based placement for large-scale heterogeneous FPGAs,” in IEEE/ACM International Conference on Computer-Aided Design (ICCAD). Westminster, CO: IEEE Press, November 2019.
[20] A. Ludwin, V. Betz, and K. Padalia, “High-quality, deterministic parallel placement for FPGAs on commodity hardware,” in ACM Symposium on FPGAs. ACM, 2008, pp. 14–23.
[21] W. Li, M. Li, J. Wang, and D. Z. Pan, “UTPlaceF 3.0: A parallelization framework for modern FPGA global placement,” in IEEE/ACM International Conference on Computer-Aided Design (ICCAD). IEEE, 2017, pp. 922–928.
[22] J. Cong and Y. Zou, “Parallel multi-level analytical global placement on graphics processing units,” in IEEE/ACM International Conference on Computer-Aided Design (ICCAD). ACM, 2009, pp. 681–688.
[23] C.-X. Lin and M. D. Wong, “Accelerate analytical placement with GPU: A generic approach,” in IEEE/ACM Proceedings Design, Automation and Test in Europe (DATE). IEEE, 2018, pp. 1345–1350.
[24] A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga et al., “PyTorch: An imperative style, high-performance deep learning library,” in Conference on Neural Information Processing Systems (NeurIPS). Curran Associates, Inc., 2019, pp. 8024–8035.
[25] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” in International Conference on Learning Representations (ICLR), 2015.
[26] I. Goodfellow, Y. Bengio, and A. Courville, Deep learning. MIT press Cambridge, 2016, vol. 1.
[27] M.-K. Hsu, Y.-W. Chang, and V. Balabanov, “TSV-aware analytical placement for 3D IC designs,” in ACM/IEEE Design Automation Conference (DAC). ACM, 2011, pp. 664–669.
[28] M.-K. Hsu, V. Balabanov, and Y.-W. Chang, “TSV-aware analytical placement for 3-D IC designs based on a novel weighted-average wirelength model,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), vol. 32, no. 4, pp. 497–509, 2013.
[29] W. C. Naylor, R. Donelly, and L. Sha, “Non-linear optimization system and method for wire length and delay optimization for an automatic electric circuit placer,” Oct. 9 2001, US Patent 6,301,693.
[30] Y. Lin, S. Dhar, W. Li, H. Ren, B. Khailany, and D. Z. Pan, “DREAMPlace: Deep learning toolkit-enabled GPU acceleration for modern VLSI placement,” in ACM/IEEE Design Automation Conference (DAC). ACM, 2019, pp. 1–6.
[31] K. A. Berman and J. Paul, Fundamentals of Sequential and Parallel Algorithms, 1st ed. Boston, MA, USA: PWS Publishing Co., 1996.
0278-0070 (c) 2020 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.